Uncertainty in Deep Learning with Implicit Neural Networks
The ability to extract uncertainties from predictions is crucial for the adoption of deep learning systems in safety-critical applications. Uncertainty estimates can serve as a failure signal, which is necessary for automating complex tasks where safety is a concern. However, current deep learning systems do not provide uncertainty estimates and can instead assign high probability to incorrect predictions. To mitigate this problem of overconfidence, this dissertation proposes three approaches that leverage the uncertainty within a distribution of models. Specifically, we consider the epistemic uncertainty given by an approximation to the posterior over model parameters. Prior work approximates this posterior with analytically known distributions, which are inflexible and underestimate the uncertainty. Instead, we propose implicit distributions, which are computationally efficient to sample from and flexible enough to parameterize a wide range of distributions. The contributions of this thesis show that implicit models enable better uncertainty estimates than prior work and can be used for open-category prediction, adversarial example detection, and exploration in reinforcement learning.
We begin by showing that implicit generative models with feature-space regularization can be used in the open-category setting to detect input distribution shift while retaining accuracy on training data. Next, we refine our approach by explicitly encouraging diversity among posterior samples with particle-based variational inference. The uncertainty given by these diverse models is used for exploration in reinforcement learning: we show that in the model-based setting we can leverage uncertainty as a novelty signal, driving exploration toward poorly understood areas of the environment. Finally, we turn to the fundamental problem of approximate Bayesian inference. We develop a framework for generative particle-based variational inference that allows for efficient sampling, places no restrictions on the approximate posterior, and improves our ability to estimate epistemic uncertainty.
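As a hedged illustration of the exploration idea above (a minimal sketch, not the dissertation's method): when an ensemble of learned dynamics models disagrees about the predicted next state, that disagreement can be added to the reward as a novelty bonus. Every name and constant below (the toy MLP models, the ensemble size, the bonus weight beta) is a hypothetical stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_model(state_dim, hidden=32):
    """One-hidden-layer dynamics model with random weights (hypothetical)."""
    return {
        "W1": rng.normal(0, 0.5, (state_dim, hidden)),
        "W2": rng.normal(0, 0.5, (hidden, state_dim)),
    }

def predict_next_state(model, state):
    """Predict s' from s with a small MLP; stands in for a learned model."""
    h = np.tanh(state @ model["W1"])
    return h @ model["W2"]

def novelty_bonus(models, state):
    """Epistemic disagreement: variance of ensemble predictions, averaged
    over state dimensions. High variance marks a poorly understood region."""
    preds = np.stack([predict_next_state(m, state) for m in models])
    return preds.var(axis=0).mean()

state_dim = 4
ensemble = [make_model(state_dim) for _ in range(8)]
s = rng.normal(size=state_dim)
extrinsic_reward = 0.0          # from the environment
beta = 1.0                      # bonus weight (hypothetical)
augmented = extrinsic_reward + beta * novelty_bonus(ensemble, s)
print(f"novelty-augmented reward: {augmented:.4f}")
```

In a full implementation the ensemble members would be trained on transition data; here they are randomly initialized only to make the disagreement computation concrete.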
Methods for Detection and Recovery of Out-of-Distribution Examples
Deep neural networks currently comprise the backbone of many applications where safety is a critical concern, such as autonomous driving and medical diagnostics. Unfortunately, these systems fail to detect out-of-distribution (OOD) inputs and can make dangerous errors when exposed to them. In addition, these same systems are vulnerable to maliciously altered inputs called adversarial examples. In response to these problems, we present two methods: one to handle out-of-distribution inputs and one to resist adversarial examples.
To detect OOD inputs, we introduce HyperGAN: a generative adversarial network that learns to generate all the parameters of a deep neural network. HyperGAN first transforms low-dimensional noise into a latent space, which can be sampled to obtain diverse, performant sets of parameters for a target architecture. By sampling many sets of parameters, we form a diverse ensemble that provides a better estimate of uncertainty than standard ensembles. We show that HyperGAN can reliably detect OOD inputs as well as adversarial examples.
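A minimal sketch of this sampling-based detection scheme, assuming a trained parameter generator (the random matrix G below is only a stand-in for it, and all dimensions are hypothetical): draw many parameter sets, run the input through each resulting classifier, and score the input by the entropy of the averaged prediction.

```python
import numpy as np

rng = np.random.default_rng(1)
in_dim, hidden, n_classes, z_dim = 8, 16, 3, 4
n_params = in_dim * hidden + hidden * n_classes

# Stand-in for a trained generator: maps noise z to a flat parameter vector.
G = rng.normal(0, 0.3, (z_dim, n_params))

def sample_classifier():
    """Draw one set of classifier parameters from the (hypothetical) generator."""
    z = rng.normal(size=z_dim)
    theta = z @ G
    W1 = theta[: in_dim * hidden].reshape(in_dim, hidden)
    W2 = theta[in_dim * hidden :].reshape(hidden, n_classes)
    return W1, W2

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predictive_entropy(x, n_samples=64):
    """Entropy of the ensemble-averaged prediction over sampled classifiers."""
    probs = []
    for _ in range(n_samples):
        W1, W2 = sample_classifier()
        probs.append(softmax(np.tanh(x @ W1) @ W2))
    p = np.mean(probs, axis=0)
    return -(p * np.log(p + 1e-12)).sum()

x = rng.normal(size=in_dim)     # toy input
print(f"predictive entropy: {predictive_entropy(x):.4f}")
```

In a real system the generator would be the trained HyperGAN, and a threshold on this entropy score would separate in-distribution inputs from OOD or adversarial ones.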
We also present BFNet, a method for recovering clean images from adversarial examples. BFNet uses a differentiable bilateral filter as a preprocessor to a neural network. The bilateral filter projects inputs back toward the space of natural images, and in doing so removes the adversarial perturbation. We show that BFNet is an effective defense in multiple attack settings and provides additional robustness when combined with other defenses.
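To make the filtering step concrete, here is a minimal plain-numpy bilateral filter (a sketch of the standard filter only; BFNet's differentiable formulation and its integration with the network are not reproduced, and the radius and sigma values are arbitrary assumptions):

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_space=2.0, sigma_range=0.1):
    """Minimal bilateral filter on a 2-D grayscale image in [0, 1].
    Each output pixel is a weighted mean of its neighbors, with weights
    decaying in both spatial distance and intensity difference, so edges
    are preserved while small pixel-level perturbations are smoothed away."""
    H, W = img.shape
    pad = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    ys, xs = np.mgrid[-radius : radius + 1, -radius : radius + 1]
    spatial = np.exp(-(ys**2 + xs**2) / (2 * sigma_space**2))
    for i in range(H):
        for j in range(W):
            patch = pad[i : i + 2 * radius + 1, j : j + 2 * radius + 1]
            rangew = np.exp(-((patch - img[i, j]) ** 2) / (2 * sigma_range**2))
            w = spatial * rangew
            out[i, j] = (w * patch).sum() / w.sum()
    return out

rng = np.random.default_rng(2)
clean = np.tile(np.linspace(0, 1, 16), (16, 1))                  # toy "image"
perturbed = np.clip(clean + rng.normal(0, 0.05, clean.shape), 0, 1)
filtered = bilateral_filter(perturbed)
print("distance to clean before:", np.linalg.norm(perturbed - clean))
print("distance to clean after: ", np.linalg.norm(filtered - clean))
```

On this toy gradient image the filtered result should land closer to the clean image, mirroring the projection-toward-natural-images intuition described above.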
A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems
Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development - both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future.
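The paper's metric suite is not reproduced here, but the flavor of such trade-off metrics can be sketched from a matrix of task performances. In the minimal sketch below, R[i, j] holds performance on task j after sequentially training through task i; the two functions and the chance baseline are illustrative assumptions in the spirit of lifelong-learning metrics, not the paper's definitions.

```python
import numpy as np

# R[i, j]: performance on task j after sequentially training through task i.
# The values are illustrative; in practice they come from evaluation runs.
R = np.array([
    [0.90, 0.20, 0.10],
    [0.80, 0.85, 0.25],
    [0.70, 0.75, 0.88],
])

def performance_maintenance(R):
    """Mean change on earlier tasks after all later training (the stability
    side of the Stability-Plasticity dilemma); negative values mean forgetting."""
    n = R.shape[0]
    return float(np.mean([R[-1, j] - R[j, j] for j in range(n - 1)]))

def forward_transfer(R):
    """Mean performance on each task just before training on it, relative to
    a no-transfer chance baseline (assumed 1/3 for three balanced classes)."""
    n = R.shape[0]
    baseline = 1.0 / 3.0
    return float(np.mean([R[j - 1, j] - baseline for j in range(1, n)]))

print("performance maintenance:", performance_maintenance(R))
print("forward transfer:       ", forward_transfer(R))
```

A metric suite of this kind makes the trade-offs visible: training choices that raise forward transfer can lower performance maintenance, and tracking both together is what allows the trade-off to be quantified rather than guessed at.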